Digital Forensic Analysis of Internet History Using Principal Component Analysis
Abstract
A modern Digital Forensic examination, even of a small-scale home computer, typically involves searching large hard disk drives, a variety of host and web-based applications that may or may not be known to the investigator, and a proliferation of web-based Internet history artefacts that may be highly significant in showing the motivation of a suspect. Faster keyword searching and larger, more accurate sets of file hashes may point the investigator to relevant artefacts, but when dealing with the new or the unknown, or when there is a need to profile the activity of the computer holistically, the investigator is left with a manual and labour-intensive investigation. This paper proposes using an unsupervised statistical learning technique, Principal Component Analysis, as a novel approach to the analysis of Digital Forensic Internet history. The approach groups and analyses artefacts to produce a high-level contextual view of the timeline data, and the appropriate number of Principal Components is selected using the Scree test method. A case study of the approach is presented, first using a simulated test data set comprising 820 Mozilla Internet history artefacts and then using a set of 5900 Internet Explorer history artefacts from real-world browser data. The results of the analysis are presented in a tabular format that provides an accessible overall view of the activity within the timeline. They show a promising approach for representing large quantities of timeline data effectively and simply at a high level, where basic patterns of usage can be determined. Further work on enhancing the proposed approach to include low-level pattern rules is discussed.
Keywords: Digital Forensics; Internet History; Principal Component Analysis
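As a rough illustration of the technique the abstract names, the following minimal sketch runs Principal Component Analysis on a small synthetic matrix and picks the number of components with a crude numeric stand-in for the Scree test (the largest gap between successive eigenvalues). It is not the paper's implementation: the data layout (time bins by activity features), the function name `pca_scree`, and the elbow heuristic are all assumptions made for the example.

```python
import numpy as np

def pca_scree(X):
    """PCA via the covariance matrix, with an elbow-style component count.

    Hypothetical helper: the 'largest eigenvalue gap' rule is only a
    numeric approximation of the (visual) Scree test.
    """
    Xc = X - X.mean(axis=0)                    # centre each feature column
    cov = np.cov(Xc, rowvar=False)             # feature-by-feature covariance
    eigvals, eigvecs = np.linalg.eigh(cov)     # eigh returns ascending eigenvalues
    order = np.argsort(eigvals)[::-1]          # reorder to descending
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    drops = eigvals[:-1] - eigvals[1:]         # gap between successive eigenvalues
    k = int(np.argmax(drops)) + 1              # keep components before the biggest gap
    scores = Xc @ eigvecs[:, :k]               # project data onto the kept components
    return eigvals, eigvecs, k, scores

# Assumed synthetic data: 200 time bins x 6 activity features, generated
# from two independent latent usage patterns plus a little noise.
rng = np.random.default_rng(0)
latent = rng.normal(size=(200, 2))
loadings = np.array([[1.0, 1.0, 1.0, 0.0, 0.0, 0.0],
                     [0.0, 0.0, 0.0, 1.0, 1.0, 1.0]])
X = latent @ loadings + rng.normal(scale=0.1, size=(200, 6))

eigvals, eigvecs, k, scores = pca_scree(X)
print("eigenvalues:", np.round(eigvals, 3))
print("components kept:", k)
```

In practice the eigenvalues would normally be plotted and the elbow judged by eye; the gap rule here only makes the sketch self-checking.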
Similar resources
A comparative evaluation of the efficiency of the metadata structure of digital identifier systems
The main solution to the problems of persistence and uniqueness in identifying digital objects in a web environment is to use digital identifiers instead of URLs. The basis of this solution is the resolution mechanism used in digital identifier systems. Resolution is the use of indirect names instead of URLs; what worked for the DNS (Domain Name System) in stabilizing i...
An assessment of the anatomical variability and contributing factors of female pelvis shape using principal component analysis
Background & aim: Pelvic shape has important effects on obstetrical outcomes. Therefore, this study aimed to determine the etiologic factors that contribute to the formation of the female pelvis and to describe its variability. Methods: This study was conducted on 131 women referred to Saint Joseph Hospital, Marseille...
Development of a cell formation heuristic by considering realistic data using principal component analysis and Taguchi’s method
Over the last four decades of research, numerous cell formation algorithms have been developed and tested, yet this research remains of interest to this day. Appropriate manufacturing cell formation is the first step in designing a cellular manufacturing system. In cellular manufacturing, consideration of manufacturing flexibility and production-related data is vital for cell formation....
An Empirical Comparison between Grade of Membership and Principal Component Analysis
It is the purpose of this paper to contribute to the discussion initiated by Wachter about the parallelism between principal component (PC) and a typological grade of membership (GoM) analysis. The author tested empirically the close relationship between both analyses in a low-dimensional framework comprising up to nine dichotomous variables and two typologies. Our contribution to the subject is also...
Outlier Detection in Wireless Sensor Networks Using Distributed Principal Component Analysis
Detecting anomalies is an important challenge for intrusion detection and fault diagnosis in wireless sensor networks (WSNs). To address the problem of outlier detection in wireless sensor networks, in this paper we present a PCA-based centralized approach and a DPCA-based distributed energy-efficient approach for detecting outliers in sensed data in a WSN. The outliers in sensed data can be ca...
Publication date: 2014